HIER-GRP: A COMPUTER PROGRAM FOR THE
HIERARCHICAL GROUPING OF REGRESSION EQUATIONS



I. INTRODUCTION 

HIER-GRP, an acronym for hierarchical grouping, is a computer program which was developed for
various Air Force research purposes at the Computational Sciences Division, Air Force Human Resources
Laboratory, Brooks AFB, Texas. Given a starting set of k regression equations, each of which contains the
same criterion and predictor variables, the basic objective of the HIER-GRP algorithm is to group or to
cluster the equations in a stepwise or iterative manner so as to minimize the overall loss of predictive
efficiency at each iteration. Initially there are k separate groups; i.e., each of the k equations is considered
as a group by itself, and a measure of overall predictive efficiency is computed. At the first iteration, all
possible ways of combining any two of the equations from the total k equations are examined, and that
combination providing the minimum loss of overall predictive efficiency is selected to form a "new group."
Formation of the new group reduces the number of equations to k-1 for the start of the second iteration.
The process continues until only one final group remains and is "hierarchical" in the sense that the pattern
of the number of groups from start to finish is k, k-1, k-2, . . ., 1.

The mathematical theory upon which HIER-GRP is based is documented in an Air Force publication
entitled An Iterative Technique for Clustering Criteria Which Retains Optimum Predictive Efficiency by
Robert A. Bottenberg and Raymond E. Christal (3). Early developmental work was also accomplished by
Joe H. Ward, Jr. (16), and some of the original programming was done by Daniel D. Rigney.

HIER-GRP or one of the earlier versions of the program has been used extensively by the Air Force in
the past, especially in conjunction with "policy-capturing applications." Policy-capturing is a methodology
composed of multiple linear regression analysis and hierarchical grouping procedures (1, 3, 4, 6, 7, 14, 16,
17, and 18). In this context, HIER-GRP was used in the development of the Weighted Airman Promotion
System (WAPS) (10) and later in the reevaluation of WAPS (12 and 13). The program was also used in
developing officer grade requirements (9), a promotion system for airman basics (2), a screening system for
the Air Reserve Forces (8), and a senior NCO promotion system (11).

This report describes the technical details that are required for the use of the HIER-GRP program as
it is currently operational on the Univac 1108 computer system at the Computational Sciences Division.
The basic algorithm is first discussed, and the essential steps are outlined. Details of the computer system
requirements and descriptions of necessary control cards are then presented. Next, the output of
HIER-GRP is explained. Appendices are included that contain the mathematical formulas used in the
program, some mathematical background helpful for understanding the algorithm, sample output, and a
complete source card listing of the program.

Partly as a result of the research studies referenced above, requests for copies of the HIER-GRP
computer program and associated documentation from different Air Force agencies, other governmental
organizations, colleges, and universities have been numerous. Since 1969, approximately ^enty copies of
HIER-GRP have been provided to different requesters and implemented on a variety of different computer
systems. One purpose of this report is to provide a document which can be used to satisfy any future
requests for HIER-GRP.



II. BASIC ALGORITHM

This section describes the basic structure of the HIER-GRP algorithm. The reader is referred to
Appendix A for computational formulas mentioned in the various steps and to Appendix B for more
detailed mathematical considerations.



The basic steps of the HIER-GRP algorithm can be summarized as the following five phases: (a) data
input and program termination, (b) computation of the overlap matrix, (c) determination of the order of
clustering, (d) computation of the statistics for the initial k criteria, and (e) iteration to reduce the number
of criteria. Each of these phases is described in the following steps. The steps are to be followed in numeric
order unless indicated otherwise.

Steps 1-2. Data Input and Program Termination

1. Read "Problem Definition" Card. This card defines k, the number of criteria or regression
equations to be grouped, and the number of standardized regression (beta) weights in each equation. If no
Problem Definition Card is read, terminate the program.

2. Read in the number of cases, the criterion means and standard deviations, the standardized
regression weights, the validities, and the predictor means and standard deviations for each equation. Assign
each equation the identification numbers 1 through k, respectively, according to the order in which the
equations were read.

Step 3. Computation of the Overlap Matrix

3. Compute the overlap matrix A, where each element a_ij denotes the decrease in overall predictive
efficiency if equation i is combined with equation j, for i = 1, 2, . . ., k; j = 1, 2, . . ., k; and i ≠ j. The
diagonal elements of A are undefined and the elements above the diagonal are symmetric with those
elements below the diagonal.

Steps 4-8. Determination of the Order of Clustering

4. Set NGRPS, the index denoting the current number of groups, equal to k. Initially each criterion
(equation) belongs to a separate cluster.

5. Considering all clusters present at the NGRPS stage, select two of the clusters, denoted by i and j,
such that

a. a_ij ≤ a_ℓm, where ℓ and m are the identification numbers of any clusters present at the NGRPS
stage and ℓ ≠ m, and

b. i < j. This can be accomplished by examining the elements above the diagonal of the overlap
matrix and selecting the smallest element.

6. Form a new criterion cluster from the old clusters i and j identified in Step 5. Record the
identifications of the two clusters i and j in the storage areas IU_NGRPS and JU_NGRPS, respectively. Assign
the new cluster the identification number i.

7. Decrement NGRPS by 1. If NGRPS > 1, go to Step 8; otherwise proceed to Step 9.

8. Update the overlap matrix as follows. For each ℓ ≠ i of Step 6, where ℓ is the identification
number of a criterion cluster present at the NGRPS stage, compute the decrease in overall predictive
efficiency if equation ℓ is combined with equation i. Since NGRPS was reduced by 1 in Step 7, the
dimension of the updated overlap matrix will be reduced by 1. Return to Step 5.
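
Steps 4-8 can be sketched in Python as follows. This is a minimal illustration rather than the program's FORTRAN logic: the function name `grouping_order` and the `loss` callback are hypothetical, the callback stands in for the Appendix A computation of the decrease in overall predictive efficiency, and every pairwise loss is recomputed at each stage instead of updating the overlap matrix in place as Step 8 does.

```python
from itertools import combinations


def grouping_order(k, loss):
    """Return the merge order [(i, j), ...] for clusters numbered 1..k.

    `loss(members_a, members_b)` is a hypothetical stand-in for the
    overlap-matrix element: the decrease in overall predictive efficiency
    incurred if the two clusters are combined.
    """
    clusters = {i: [i] for i in range(1, k + 1)}      # Step 4: NGRPS = k
    order = []
    while len(clusters) > 1:                          # Step 7 loop control
        # Step 5: smallest element above the diagonal, so i < j.
        i, j = min(
            combinations(sorted(clusters), 2),
            key=lambda pair: loss(clusters[pair[0]], clusters[pair[1]]),
        )
        order.append((i, j))                          # Step 6: record IU, JU
        clusters[i] = clusters[i] + clusters.pop(j)   # new cluster keeps ID i
    return order


# Toy data: one number per equation; the loss is the gap between cluster means.
values = {1: 0.0, 2: 10.0, 3: 0.5, 4: 9.0}


def toy_loss(a, b):
    mean_a = sum(values[m] for m in a) / len(a)
    mean_b = sum(values[m] for m in b) / len(b)
    return abs(mean_a - mean_b)


print(grouping_order(4, toy_loss))  # [(1, 3), (2, 4), (1, 2)]
```

The recorded pairs play the role of the IU and JU storage areas: replaying them in order reproduces the groupings at the k-1, k-2, . . ., 1 stages.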

Step 9. Computation of the Statistics for the Initial k Criteria

9. Compute the squared multiple correlation coefficient for each of the initial k regression equations
and, also, ORU_k, the overall squared multiple correlation coefficient obtained by considering a regression
model with no grouping of initial equations.



Steps 10-15. Iteration to Reduce the Number of Criteria

10. Form an initial grouping of the k equations by assigning each equation to a group by itself. This
is the "k groups" stage. Set NGRPS equal to k.

11. Form a new grouping of the k equations by following the grouping order established in Steps
4-8. This is accomplished by combining the groups identified by IU_NGRPS and JU_NGRPS and assigning
the new group (criterion cluster) the identification number IU_NGRPS.

12. Compute the least squares regression equation which can be used to predict the new group and
decrement NGRPS by 1.

13. Print all statistics concerning the new grouping:

a. The identification numbers of the two equations combined at this iteration,

b. An F value testing the difference between the prediction equations for the two clusters in
(a),

c. An F value testing the difference between the k initial prediction equations and the smaller
set of NGRPS equations (one for each cluster) used at the "NGRPS groups" stage, and

d. The overall squared multiple correlation coefficient obtained using the NGRPS equations at
this stage.

14. Print a summary of all groups (clusters) present at the NGRPS stage. Also, print the prediction
equation for the new group (including standardized and raw score weights).

15. If NGRPS > 1, loop back to Step 11; otherwise, return to Step 1 and begin the next problem.

III. DESCRIPTION OF THE HIER-GRP PROGRAM

Systems Requirements

The HIER-GRP program is composed of seven routines: the main or driver routine and six
subroutines. The entire program, with the exception of the Univac Assembly Language subroutine START,
is written in FORTRAN IV. The assembly subroutine START is called once at the beginning of the driver
routine and is never called again. Its only function is to reset the margin control on the Univac 1108
printer.

The Univac version of FORTRAN has a special statement, the Parameter statement, which is used in
the driver routine and which may not be available on other computers. The Parameter statement is used to
define the dimensions of arrays at compilation time. The Parameter statement can be removed if each array
is dimensioned to its required size.

The complete HIER-GRP program requires approximately 10,000 36-bit words of core storage in
addition to the number of words required for arrays. If P is the number of predictors and E is the number
of equations, then the amount of storage required for arrays is 12E + 3P + [2*E*P] + [E*(E-1)/2] + 14. For
example, if P = 50 and E = 50, then 6,989 words of storage are required for arrays.
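
The array storage formula can be checked with a few lines of Python. The function name `array_storage_words` is hypothetical and is used only for this illustration; the arithmetic is taken directly from the formula above.

```python
def array_storage_words(p, e):
    """Words of array storage for p predictors and e equations."""
    # e * (e - 1) is always even, so the integer division is exact.
    return 12 * e + 3 * p + 2 * e * p + e * (e - 1) // 2 + 14


print(array_storage_words(50, 50))  # 6989
```

The 2*E*P term dominates for problems of realistic size, so the array requirement grows roughly in proportion to the product of the number of equations and the number of predictors.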

There are a total of 1,121 cards in the HIER-GRP program deck. Of these, only 601 are source
language cards and the remainder are comments cards. The number of cards and the intrinsic system
routines required in each of the seven routines which make up HIER-GRP are listed in Table 1.



Table 1. Characteristics of the HIER-GRP Routines

                                Number of Source   Number of        Intrinsic System
Name            Language        Language Cards     Comments Cards   Routines Required

DRIVER (MAIN)   FORTRAN IV      100                311              None
START           ASSEMBLY          7                  0              None
OVRLP           FORTRAN IV       36                 36              None
GROUP           FORTRAN IV       76                 48              None
STAGE           FORTRAN IV       81                 42              None
PRINTG          FORTRAN IV      218                 82              SQRT
PLEVEL          FORTRAN IV       83                  1              ATAN, SQRT, ALOG, EXP, SIN



Data Requirements

A HIER-GRP user must supply the following data for each regression equation:

1. The number of cases (N) which were used to compute the equation

2. The criterion mean and standard deviation

3. The standardized regression weights

4. The validity coefficients (correlations of predictor or independent variables with the criterion or
dependent variable)

5. The predictor means and standard deviations

The computational formulas developed by Bottenberg and Christal (3) and used within the program
assume that the predictor sums of squares and cross products matrices are proportional; i.e., that the ratios
of the corresponding elements of the sums of squares and cross products matrices for any two equations to
be clustered are equal to the ratio of the corresponding numbers of cases within each equation. This
assumption of proportionality is discussed in detail by Bottenberg and Christal (1961, see pages 8 through
11) and also addressed in Appendix B (see equation 9b) of this report. In practice this assumption is met by
selecting items (1) and (5) of the previous paragraph to be identical for each equation.

Run-stream Organization

The following card sequence is required to use the HIER-GRP program as it is operational on a
Univac 1108 computer:

Order   Card Type

 1.     @RUN
 2.     @XQT T*T.HIERGRP
 3.     Problem Definition Card
 4.     Header Card(s)
 5.     Format Card for Equation Ns
 6.     Data Card(s) - Equation Ns
 7.     Format Card for Criterion Means and SDs
 8.     Data Card(s) - Criterion Means and SDs
 9.     Format Card for Beta Weights
10.     Data Card(s) - Beta Weights
11.     Format Card for Validities
12.     Data Card(s) - Validities
13.     Format Card for Predictor Means and SDs
14.     Data Card(s) - Predictor Means and SDs
15.     The sequence of cards 3-14 is required for each run.
        As many problems as desired may be run by stacking one
        problem after another.
16.     Blank Card to Terminate Run
17.     @FIN



The Univac 1108 System Cards (1, 2, and 17) are described in the Univac reference manual
(15). Descriptions of cards 3-16 are presented in the next section. See Appendix C for sample run-stream
and sample control cards.

Control Cards

Problem Definition Card

Card      FORTRAN
Columns   Format    Variable   Description

1-3       I3        NEQS       The number of criteria (systems, regression
                               equations) in this problem. NEQS must be
                               less than or equal to 50.

4-6       I3        NPREDS     The number of beta weights (standardized
                               regression weights) in each equation.
                               NPREDS must be less than or equal to 100.

7         I1                   The grouping (clustering) option desired.
                               Normally the option is specified which causes
                               the grouping to be done based on the
                               iterative technique developed by Bottenberg
                               and Christal (3). Other options are
                               included in the program and comments cards,
                               but are for future developmental purposes only.

8         I1        NHDRS      The number of header (label, title) cards
                               that follow this control card. Header cards
                               may be omitted (NHDRS = 0) or up to 9 cards
                               may be specified.

9         I1        IREAD      The data read option. IREAD = 0 means read the
                               beta weights and validities NPREDS items at
                               a time. IREAD = 1 means read them NEQS*NPREDS
                               items at a time. IREAD allows flexibility in
                               the format of input data. However, IREAD is
                               normally set equal to zero.

10-80                          These card columns are not read.
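
The fixed-column layout of the Problem Definition Card can be illustrated with a small Python parser. This is a sketch under stated assumptions, not a transcription of the program's input routine: the names `ProblemCard`, `parse_problem_card`, and the field name `option` are hypothetical, the single-column positions 7, 8, and 9 are assumed from the layout above, and fields are assumed to be right-justified with blank fields read as zero.

```python
from typing import NamedTuple, Optional


class ProblemCard(NamedTuple):
    neqs: int     # columns 1-3: number of criteria (equations)
    npreds: int   # columns 4-6: number of beta weights per equation
    option: int   # column 7 (assumed): grouping option
    nhdrs: int    # column 8 (assumed): number of header cards
    iread: int    # column 9 (assumed): data read option


def _field(card: str, first: int, last: int) -> int:
    """Integer found in card columns first..last (1-based); blank reads as 0."""
    text = card[first - 1:last].strip()
    return int(text) if text else 0


def parse_problem_card(card: str) -> Optional[ProblemCard]:
    """Parse one card image; a blank card returns None (end of run)."""
    card = card.ljust(80)
    if not card[:9].strip():
        return None
    parsed = ProblemCard(
        _field(card, 1, 3), _field(card, 4, 6),
        _field(card, 7, 7), _field(card, 8, 8), _field(card, 9, 9),
    )
    if not 1 <= parsed.neqs <= 50:
        raise ValueError("NEQS must be between 1 and 50")
    if not 1 <= parsed.npreds <= 100:
        raise ValueError("NPREDS must be between 1 and 100")
    return parsed


print(parse_problem_card(" 12  5120"))
```

Columns 10-80 are ignored, matching the statement that those card columns are not read.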



Header Cards

Each header card will be printed only once at the beginning of the grouping report. Exactly NHDRS
header cards must be present.



Format and Data Cards

1. Format Card for Equation Ns. This card supplies the FORTRAN variable format by which the
number of cases used in the computation of each equation is to be read. Only the F and X editing codes are
permitted.

2. Data Card(s) - Equation Ns. These cards are read according to the previous format card. The
number of cards required depends on the format specifications.

3. Format Card for Criterion Means and SDs. This card provides the FORTRAN variable format by
which the criterion mean and standard deviation for each equation are to be read. Only the F and X editing
codes are permitted.

4. Data Card(s) - Criterion Means and SDs. These cards are read according to the previous format
card. The number of cards required depends on the format specifications.

5. Format Card for Beta Weights. This card supplies the FORTRAN variable format by which the
beta weights (NPREDS weights per equation) are to be read. Only the F and X editing codes are permitted.

6. Data Card(s) - Beta Weights. These cards are read according to the previous format card. Exactly
NEQS sets of cards are required if IREAD = 0. The first set contains the beta weights for equation 1, the
second set contains the beta weights for equation 2, and so on. The number of cards within each set
depends on the format specifications.

7. Format Card for Validities. This card provides the FORTRAN variable format by which the
validity coefficients for each equation are read. Only the F and X editing codes are permitted.

8. Data Card(s) - Validities. These cards are read according to the previous format card. Exactly
NEQS sets of cards are required if IREAD = 0. The first set contains the validities for equation 1, the
second set contains the validities for equation 2, and so on. The number of cards within each set depends
on the format specifications.

9. Format Card for Predictor Means and SDs. This card supplies the FORTRAN variable format by
which the predictor means and standard deviations for each equation are to be read. Only the F and X
editing codes are permitted.

10. Data Card(s) - Predictor Means and SDs. These cards are read according to the previous format
card. The number of cards required depends on the format specifications.

Output

The printed output of HIER-GRP is divided into five parts: the monogram and version date, the
control card parameters, the problem header label, the format and input data cards, and the criterion
grouping results. Each of these divisions is described in the following paragraphs. Refer to Appendix C for
sample output.

Monogram and Version Date

The program title, "Hierarchical Grouping Program HIER-GRP," the AFHRL monogram, and the
program version date are printed at the beginning of each problem. The program version date is the last
time the program was updated or modified.

Control Card Parameters

The parameters specified on the Problem Definition Card are printed under the heading CONTROL
CARD PARAMETERS. This includes the number of regression equations (criteria), the number of beta
weights in each equation, the grouping option desired, and the number of header cards for this problem.



Problem Header Label

The problem header label, if header cards were specified on the Problem Definition Card, is printed
under the heading PROBLEM HEADER LABEL.

Format and Input Data Cards

All format cards and all input data are printed under the heading FORMAT CARDS AND INPUT
DATA. First, the format statements used to read the number of cases and the criterion means and standard
deviations for each equation are printed. A table listing the equation numbers, the number of cases, the
criterion means, and the criterion standard deviations is printed next. Third, the format statement used to
read the beta weights and a table listing the equation number and the beta weights (15 per line) for each
equation are printed. Fourth, the format statement used to read the validity coefficients and a table listing
the equation number and the validities (15 per line) for each equation are printed. Finally, the format
statement used to read the predictor means and standard deviations and a table listing the predictor variable
number and predictor means and standard deviations (one each per line) are printed.

Criterion Grouping Results

The results of the clustering process are printed under the heading HIERARCHICAL GROUPING
RESULTS. The output in this division can be separated into three parts: the grouping option description,
the R-square (RSQ) summary for the NEQS initial criteria, and the results of each iteration. Each of these
sections is described as follows.

1. Grouping Option Description. The grouping option and a verbal description of the grouping
option specified on the Problem Definition Card are printed.

2. RSQ Summary for the NEQS Initial Criteria. The number, NEQS, of initial criteria; the overall
RSQ, ORU_NEQS, achieved by using the beta weights specified on the input data cards; and a table listing
the equation number and the RSQ for each equation are printed.

3. Results of Each Iteration. The statistics and tables printed at each iteration, i.e., the information
printed below each row of asterisks, are listed in Table 2.



Table 2. Output for Each Iteration

Each entry gives the computer output label followed by its meaning. In the labels, ℓ denotes the number
of criterion clusters present at the end of the current iteration.

Stage = ℓ
    ℓ is the number of criterion clusters present at the end of this iteration.

OVERALL RSQ = ORU_ℓ
    This is the RSQ obtained by using ℓ equations (one for each criterion cluster present at this
    stage) to predict the NEQS initial criteria.

SYSTEMS GROUPING THIS STAGE Table

  SYS IDENT
    The identification (ID) numbers of the two criterion clusters combined at this iteration.

  NO. MEMBERS
    The number of members in each criterion cluster. The ID numbers of the members of each
    cluster can be obtained by referring to the summary roster for stage ℓ+1.

  NO. CASES
    The number of cases used in the computation of the prediction equation for each criterion
    cluster. This number is the sum of the number of cases used in the prediction equation for
    each member of the cluster.

  RSQ
    The squared multiple correlation coefficient which is obtained by predicting each criterion
    within a cluster from the same compromise regression equation.

  DECISION VALUE
    The loss associated with replacement of the two clusters combined at this stage.

F-TEST FOR THE EQUALITY OF REGRESSION PARAMETERS FOR SYS'S COMBINED AT THIS STAGE Table
    This table outlines a test of the hypothesis that the prediction equations for the two criterion
    clusters combined at this stage are identical. Equivalently, it is a test of the loss in predictive
    efficiency when each criterion within the two clusters combined at this stage is predicted
    from the same compromise equation.

  CHANGE FROM ℓ+1 SYSTEMS

    RSQ = ORU_(ℓ+1) - ORU_ℓ
      The decrease in OVERALL RSQ from stage ℓ+1.

    DF = NPREDS+1
      The decrease in the number of parameters estimated from stage ℓ+1.

  RESIDUAL

    RSQ = 1 - ORU_(ℓ+1)
      The proportion of the criterion variance attributable to error at stage ℓ+1.

    DF = N - (ℓ+1)(NPREDS+1)
      The total number of cases less the number of parameters estimated at stage ℓ+1. Equivalently,
      DF is the number of degrees of freedom associated with the error portion of the criterion
      variance at stage ℓ+1.

  FSTAT = [(ORU_(ℓ+1) - ORU_ℓ)/(NPREDS+1)] / [(1 - ORU_(ℓ+1))/(N - (ℓ+1)(NPREDS+1))]
      The F statistic testing the hypothesis described in the preceding paragraph (FOR SYS'S
      COMBINED AT THIS STAGE).



  SIG LVL
      The probability that a value of the F statistic greater than FSTAT would occur by chance. A
      value of SIG LVL equal to α means that if the hypothesis being tested is true, then a value of
      the F statistic greater than FSTAT would have occurred 100α percent of the time by chance.
      Therefore, small values of α tend to reject the hypothesis being tested.

F-TEST FOR THE EQUALITY OF REGRESSION PARAMETERS FOR SYS'S COMBINED UP TO THIS STAGE Table
    This table outlines a test of the hypothesis that the prediction equations for all members of
    criterion cluster number 1 are identical, the prediction equations for all members of criterion
    cluster 2 are identical, and so on for the ℓ criterion clusters present at the end of this
    iteration. Equivalently, this tests the loss in predictive efficiency when ℓ equations (one for
    each criterion cluster) are used to predict the NEQS initial criteria instead of the original
    NEQS equations.

  CHANGE FROM NEQS SYSTEMS

    RSQ = ORU_NEQS - ORU_ℓ
      The decrease in OVERALL RSQ from stage NEQS.

    DF = (NEQS - ℓ)(NPREDS+1)
      The decrease in the number of parameters estimated from stage NEQS.

  RESIDUAL

    RSQ = 1 - ORU_NEQS
      The proportion of the criterion variance attributable to error at stage NEQS.

    DF = N - (NEQS)(NPREDS+1)
      The total number of cases less the number of parameters estimated at stage NEQS.
      Equivalently, DF is the number of degrees of freedom associated with the error portion of the
      criterion variance at stage NEQS.

  FSTAT = [(ORU_NEQS - ORU_ℓ)/((NEQS - ℓ)(NPREDS+1))] / [(1 - ORU_NEQS)/(N - (NEQS)(NPREDS+1))]
      The F statistic testing the hypothesis described in the preceding paragraph (FOR SYS'S
      COMBINED UP TO THIS STAGE).

  SIG LVL
      The probability that a value of the F statistic greater than FSTAT would occur by chance. A
      value of SIG LVL equal to α means that if the hypothesis being tested is true, then a value of
      the F statistic greater than FSTAT would have occurred 100α percent of the time by chance.
      Therefore, small values of α tend to reject the hypothesis being tested.

SYSTEMS SUMMARY ROSTER Table
    The summary roster is a listing of all the criterion clusters present at the end of the current
    iteration. The members and the RSQ for each cluster are also printed. In addition, the
    prediction equation and the system mean and standard deviation for the new criterion cluster
    formed at the present iteration are printed. The compromise equation for each criterion
    cluster present at a given iteration can be obtained by referring to the summary roster for the
    stage at which the cluster was formed.



  STAGE IDENT
      The stage at which each criterion cluster was formed.

  SYS LOSS
      The contribution of each criterion cluster to the decrease in OVERALL RSQ from stage NEQS.
      Equivalently, this is the amount by which the OVERALL RSQ would increase if each of the
      criteria within this cluster were predicted from their individual regression equations rather
      than from the compromise equation for the cluster.

  NO. MEMBERS
      The number of criteria within each criterion cluster. The ID numbers of the members of each
      cluster are listed under the headings SYS IDENT and IDENTIFICATION OF OTHER MEMBERS
      in this table.

  RSQ
      The squared multiple correlation coefficient which is obtained by predicting each criterion
      within a cluster from the same compromise regression equation.

  NO. CASES
      The number of cases used in the computation of the compromise equation for a criterion
      cluster. This number is the sum of the number of cases used to compute the regression
      equation for each criterion within the cluster.

  SYS IDENT
      The ID number of a criterion cluster. This is also the smallest ID number of the criteria
      within this cluster.

  IDENTIFICATION OF OTHER MEMBERS
      The ID numbers of the remaining criteria within a cluster.

  NEW SYS CRITERION MEAN
      The criterion mean for the cluster formed at this iteration.

  NEW SYS CRITERION SD
      The criterion standard deviation for the cluster formed at this iteration.

  BETA WEIGHTS FOR THE NEW SYSTEM S
      The values (10 per line) of the least squares standardized regression coefficients for the
      regression equation which is the best single predictor for all the criteria in the new cluster,
      where S is the ID number of the new cluster. Equivalently, these are the beta weights which
      would be obtained by pooling the observations for all the criteria in the new cluster and
      computing the regression of the pooled criteria on the NPREDS predictor variables.

  RAW SCORE WEIGHTS FOR THE NEW SYSTEM S
      The values (5 per line) of the raw score weights for the regression equation which is the best
      single predictor for all the criteria in the new cluster S.

  REGRESSION CONSTANT
      The regression constant for the regression equation which is the best single predictor of all
      the criteria in the new cluster.

  Y SINGLE MEMBER SYSTEMS
      A list of the identification numbers of the "Y" single criteria which have not been combined
      with any system up to this stage.
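
The two FSTAT entries of Table 2 can be computed as in the following Python sketch. The function names and argument names are hypothetical; only the arithmetic comes from the table. SIG LVL is not computed here, because it requires the upper-tail probability of the F distribution, which the Python standard library does not provide.

```python
def f_this_stage(oru_prev, oru_now, n, l, npreds):
    """FSTAT for the two systems combined at this stage.

    oru_prev is the OVERALL RSQ at stage l+1, oru_now the OVERALL RSQ at
    stage l, n the total number of cases. Returns (FSTAT, df1, df2).
    """
    df1 = npreds + 1
    df2 = n - (l + 1) * (npreds + 1)
    return ((oru_prev - oru_now) / df1) / ((1.0 - oru_prev) / df2), df1, df2


def f_up_to_stage(oru_neqs, oru_now, n, l, neqs, npreds):
    """FSTAT for all systems combined up to this stage.

    oru_neqs is the OVERALL RSQ with no grouping (stage NEQS).
    Returns (FSTAT, df1, df2).
    """
    df1 = (neqs - l) * (npreds + 1)
    df2 = n - neqs * (npreds + 1)
    return ((oru_neqs - oru_now) / df1) / ((1.0 - oru_neqs) / df2), df1, df2


# Illustrative values only: 5 initial equations, 4 predictors, 1000 cases,
# and 3 clusters remaining at the end of the current iteration.
print(f_this_stage(0.60, 0.58, n=1000, l=3, npreds=4))
print(f_up_to_stage(0.62, 0.58, n=1000, l=3, neqs=5, npreds=4))
```

The first test compares adjacent stages, so its numerator always has NPREDS+1 degrees of freedom; the second compares the current stage with the ungrouped equations, so its numerator degrees of freedom grow as clusters are merged.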
APPENDIX A: NOTATION AND COMPUTATIONAL FORMULAS

$'$, The transpose of the associated matrix.

$k$, The initial number of criteria.

$p$, The number of variables.

$n_i$, The number of cases used in the computation of the regression equation for criterion i.

$m_i$, The mean for criterion i.

$\sigma_i^2$, The variance for criterion i.

$a_i$, The constant term in the regression equation for criterion i.

$b_i$, The p×1 vector of regression weights for criterion i.

$\beta_i$, The p×1 vector of standard regression weights for criterion i.

$c_i$, The p×1 vector of validities (intercorrelations between the criterion and the p independent variables)
for criterion i.

$N$, The total number of cases.
    $N = n_1 + n_2 + \cdots + n_k$

$m_0$, The pooled criterion mean.
    $N m_0 = n_1 m_1 + n_2 m_2 + \cdots + n_k m_k$

$\sigma_0^2$, The pooled criterion variance.
    $N \sigma_0^2 = n_1(\sigma_1^2 + m_1^2) + \cdots + n_k(\sigma_k^2 + m_k^2) - N m_0^2$

$g_I$, The number of criteria in cluster I.

$I$, The set of criteria in cluster I, $I = \{i_1, i_2, \ldots, i_{g_I}\}$. In the succeeding definitions, let I be the union
of clusters J and L.

$N_I$, The number of cases used in the computation of the composite equation for cluster I.
    $N_I = \sum_{i \in I} n_i = N_J + N_L$

$M_I$, The criterion mean for cluster I.
    $N_I M_I = \sum_{i \in I} n_i m_i = N_J M_J + N_L M_L$

$\sigma_I^2$, The criterion variance for cluster I.
    $N_I \sigma_I^2 = \sum_{i \in I} n_i(\sigma_i^2 + m_i^2) - N_I M_I^2 = N_J(\sigma_J^2 + M_J^2) + N_L(\sigma_L^2 + M_L^2) - N_I M_I^2$

$a_I$, The constant term in the regression equation for cluster I.

$b_I$, The p×1 vector of regression weights for cluster I.
    $N_I b_I = \sum_{i \in I} n_i b_i = N_J b_J + N_L b_L$

$\beta_I$, The p×1 vector of standard regression weights for cluster I.

$c_I$, The p×1 vector of validities for cluster I.

$R_i^2$, The squared multiple correlation coefficient for the regression on criterion i.

$R_I^2$, The squared multiple correlation coefficient for the regression on cluster I.

$G_s$, The set of s criterion clusters present at the s cluster stage.

$ORU_s$, The squared multiple correlation coefficient for the criterion grouping, $G_s$, at the s cluster stage.

Let $G_s = \{J, L, K_3, \ldots, K_s\}$ and
$G_{s-1} = \{J \cup L, K_3, \ldots, K_s\}$; then

Na»(Nj+N^_) ' ■ • 



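
The pooled criterion mean and variance defined in this appendix can be checked numerically. The sketch below is illustrative only: the function name `pool` is hypothetical, and it assumes that each variance is a population variance (sum of squared deviations divided by the number of cases), which is what the pooling formula implies.

```python
def pool(groups):
    """Pool (n_i, m_i, variance_i) triples into (N, pooled mean, pooled variance).

    Assumes population variances, i.e. divisor n_i rather than n_i - 1.
    """
    n = sum(n_i for n_i, _, _ in groups)
    mean = sum(n_i * m_i for n_i, m_i, _ in groups) / n
    # N * var = sum of n_i * (var_i + m_i**2) minus N * mean**2
    var = sum(n_i * (v_i + m_i * m_i) for n_i, m_i, v_i in groups) / n - mean * mean
    return n, mean, var


# Criterion scores 1, 2, 3 (mean 2, variance 2/3) pooled with
# 4, 5, 6, 7 (mean 5.5, variance 1.25).
print(pool([(3, 2.0, 2.0 / 3.0), (4, 5.5, 1.25)]))
```

The same function applies to the cluster-level quantities $N_I$, $M_I$, and $\sigma_I^2$ when the triples passed in describe clusters J and L instead of individual criteria.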
APPENDIX B: MATHEMATICAL BACKGROUND

Mathematical Model for the Clustering Algorithm



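Suppose that a set of p independent variables, $v' = \{v_1, \ldots, v_p\}$, are linearly related to the expected
values of each of k criteria, $Y_1, \ldots, Y_k$; that is,

(1) $E(Y_i \mid v) = v' b_i + a_i$ for $i = 1, \ldots, k$

where $b_i$ is a p×1 vector of regression weights and $a_i$ is a constant term. For criterion i, let $y_i$ be the
$n_i \times 1$ vector of criterion observations, let $X_i$ be the $n_i \times p$ matrix of the corresponding observations on
the set of p independent variables, and let $u_i$ be an $n_i \times 1$ vector in which each element is 1. Then from (1),

(1a) $E(y_i \mid X_i) = X_i b_i + u_i a_i$ for $i = 1, \ldots, k$.

Let $N = n_1 + \cdots + n_k$; let $Y' = [y_1', \ldots, y_k']$, the 1×N vector obtained by pooling all the criterion
observations; let

$$X = \begin{bmatrix} [u_1\; X_1] & 0 & \cdots & 0 \\ 0 & [u_2\; X_2] & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & [u_k\; X_k] \end{bmatrix},$$

the N×k(p+1) block diagonal matrix obtained by placing the $n_i \times (p+1)$ matrix of observations $[u_i\; X_i]$ in the
ith block diagonal position; and let $b' = [a_1\; b_1'\; \ldots\; a_k\; b_k']$, the 1×k(p+1) vector of unknown parameters.
Under the assumption that the criterion observations are independent and have common variance, the
mathematical model for the clustering algorithm is

(1b) $E(Y \mid X) = Xb$ with $D(Y \mid X) = \sigma^2 I$,

where $D(Y \mid X)$ is the dispersion matrix of the criterion observations, $\sigma^2$ is the common variance, and I is the
N×N identity matrix.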

Minimum Variance Unbiased Estimation and Hypothesis Testing

The k(p+1)×1 vector b of unknown parameters in (1b) corresponds to the k equations in (1a). The
minimum variance unbiased estimates (mvue), $\hat{a}_i$ and $\hat{b}_i$, of $a_i$ and $b_i$ are obtained from (1b) by the method
of least squares, where

(2) $\hat{b}_i = \left[X_i' X_i - \tfrac{1}{n_i} X_i' u_i u_i' X_i\right]^{-1} \left[X_i' y_i - \tfrac{1}{n_i} X_i' u_i u_i' y_i\right]$ and
    $\hat{a}_i = \tfrac{1}{n_i} u_i' y_i - \tfrac{1}{n_i} u_i' X_i \hat{b}_i$ for $i = 1, \ldots, k$.

These are the estimates that would be obtained by the method of least squares from the k separate models

(3) $E(y_i \mid X_i) = X_i b_i + u_i a_i$ with $D(y_i \mid X_i) = \sigma^2 I$ for $i = 1, \ldots, k$

where the error variance, $\sigma^2$, is the same for each model. It might be that some or all of the equations in (1)
are identical. The technique of homogeneity of regression can be used to test the equality of vectors of
regression parameters across several criteria. Chipman and Rao (1964) and Theil (1970) have developed
methods for obtaining mvue under general linear restrictions and for testing general linear hypotheses. Rao
(1965, pp. 189-190) shows that in the case

(4) $E(Y \mid X) = Xb$ with $D(Y \mid X) = \sigma^2 I$,

where X is n×s of rank s and b is s×1, the mvue, $b_\Psi$, for b under the linear restriction

(4a) $\Psi b = 0$ is

(4b) $b_\Psi = B(B'X'XB)^{-1}B'X'Y$

where $\Psi$ is r×s of rank r, B is s×(s-r) of rank (s-r), and $\Psi B = 0$. Rao obtains this result by introducing the
general solution, $b = B\theta$, where $\theta$ is an (s-r)×1 vector of new parameters, of (4a) into (4) to obtain the model

(5) $E(Y \mid X) = XB\theta$ with $D(Y \mid X) = \sigma^2 I$

and no restrictions on $\theta$. The mvue of $B\theta$ is $B\hat{\theta}$ (see Rao, 1965, pp. 181-182), where $\hat{\theta}$ is the mvue of
$\theta$ in (5). If, in addition to (4), Y has a multivariate normal distribution, then Chipman and Rao develop an
expression for an unbiased critical region of size $\alpha$ for the following hypothesis:

(6) $\Psi_1 b = 0$ given that $\Psi_0 b = 0$

where $\Psi_1$ is $r_1 \times s$ of rank $r_1$, $\Psi_0$ is $r_0 \times s$ of rank $r_0$, and $\Psi' = [\Psi_0'\; \Psi_1']$ is $s \times (r_0 + r_1)$ of rank $(r_0 + r_1)$. The
expression for the unbiased critical region of size $\alpha$ is

(7) $\left\{ F \;\middle|\; F = \left(\frac{n-s+r_0}{r_1}\right) \left(\frac{EXSS}{ESS_\Omega}\right) = \left(\frac{n-s+r_0}{r_1}\right) \left(\frac{R_\Omega^2 - R_H^2}{1 - R_\Omega^2}\right) \geq F_\alpha(r_1,\ n-s+r_0) \right\}$

where $F_\alpha(r_1,\ n-s+r_0)$ is the upper $100(1-\alpha)\%$ point of the central F distribution with $r_1$ and $n-s+r_0$ degrees of freedom, and

$ESS_\Omega = (Y - X\hat{b}_\Omega)'(Y - X\hat{b}_\Omega)$,

$EXSS = (Y - X\hat{b}_H)'(Y - X\hat{b}_H) - ESS_\Omega$,

$\hat{b}_\Omega$ is the mvue of b under the restriction $\Psi_0 b = 0$,

$\hat{b}_H$ is the mvue of b under the restriction $\Psi b = 0$,

$R_\Omega^2$ is the squared multiple correlation under the restriction $\Psi_0 b = 0$, and

$R_H^2$ is the squared multiple correlation under the restriction $\Psi b = 0$.

The Chipman and Rao computational form for F is different from the form in (7), but the two are equivalent. (See Rao, 1965, pp. 199-200.)
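The error-sum-of-squares form of the statistic in (7) can be sketched as follows. This is a minimal illustration, not the Chipman and Rao computational form; the function name `f_from_ess` and the numeric inputs are hypothetical. Comparing the result with a tabled critical value is left to the reader.

```python
def f_from_ess(ess_omega, ess_h, n, s, r0, r1):
    """F statistic of (7) from error sums of squares.

    ess_omega: error sum of squares under the restriction Psi_0 b = 0
    ess_h:     error sum of squares under the full restriction Psi b = 0
    """
    df1 = r1              # numerator degrees of freedom
    df2 = n - s + r0      # denominator degrees of freedom
    exss = ess_h - ess_omega   # extra error attributable to Psi_1 b = 0
    f = (df2 / df1) * (exss / ess_omega)
    return f, df1, df2


# Hypothetical values: 100 observations, 8 parameters, no prior restrictions,
# and 4 restrictions under test.
f, df1, df2 = f_from_ess(ess_omega=50.0, ess_h=60.0, n=100, s=8, r0=0, r1=4)
```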



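MVUE for a Criterion Cluster

The restriction $a_1 = a_2 = \ldots = a_t$ and $b_1 = b_2 = \ldots = b_t$ can be expressed in the form $\Psi b = 0$ as

(8) $\Psi b = \begin{bmatrix} I & -I & 0 & \cdots & 0 & 0 \\ 0 & I & -I & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & I & -I & 0 \end{bmatrix} b = 0$,

where I is the $(p+1) \times (p+1)$ identity matrix, $\Psi$ has $(t-1)(p+1)$ rows, the first t block columns span $t(p+1)$ columns, and the final zero block spans $(k-t)(p+1)$ columns. To express model (1b) in a form similar to equation (5) under the above restriction (8), the $k(p+1) \times (k-t+1)(p+1)$ matrix B, where

$B = \begin{bmatrix} I & 0 \\ \vdots & \vdots \\ I & 0 \\ 0 & I^{*} \end{bmatrix}$,

in which the first $t(p+1)$ rows hold t copies of the $(p+1) \times (p+1)$ identity matrix I and $I^{*}$ is the $(k-t)(p+1) \times (k-t)(p+1)$ identity matrix, and the $(k-t+1)(p+1)$ vector of new parameters $\theta$, where

$\theta' = [a_{12 \ldots t}\ b_{12 \ldots t}'\ a_{t+1}\ b_{t+1}' \ldots a_k\ b_k']$,

is introduced into (1b), to yield the model

(9) $E(Y \mid X) = XB\theta = \begin{bmatrix} [u_1 X_1] & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ [u_t X_t] & 0 & \cdots & 0 \\ 0 & [u_{t+1} X_{t+1}] & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & [u_k X_k] \end{bmatrix} \begin{bmatrix} a_{12 \ldots t} \\ b_{12 \ldots t} \\ a_{t+1} \\ b_{t+1} \\ \vdots \\ a_k \\ b_k \end{bmatrix}$ with $D(Y \mid X) = \sigma^2 I$.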



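The effect of B is to pool the observations for criteria 1, ..., t. The mvue, $\hat{a}_{12 \ldots t}$ and $\hat{b}_{12 \ldots t}$, for the criterion cluster (1, 2, ..., t) formed from criteria 1, ..., t can be calculated in either of two ways: pool the observations as in (9) and compute $\hat{a}_{12 \ldots t}$ and $\hat{b}_{12 \ldots t}$ from the normal equations

(9a) $\left[ \sum_{i=1}^{t} \begin{bmatrix} n_i & u_i'X_i \\ X_i'u_i & X_i'X_i \end{bmatrix} \right] \begin{bmatrix} \hat{a}_{12 \ldots t} \\ \hat{b}_{12 \ldots t} \end{bmatrix} = \sum_{i=1}^{t} \begin{bmatrix} u_i'y_i \\ X_i'y_i \end{bmatrix}$

or, if the predictor sums-of-squares and cross-product matrices are proportional, i.e.,

(9b) $\frac{1}{n_1} \begin{bmatrix} n_1 & u_1'X_1 \\ X_1'u_1 & X_1'X_1 \end{bmatrix} = \frac{1}{n_2} \begin{bmatrix} n_2 & u_2'X_2 \\ X_2'u_2 & X_2'X_2 \end{bmatrix} = \ldots = \frac{1}{n_t} \begin{bmatrix} n_t & u_t'X_t \\ X_t'u_t & X_t'X_t \end{bmatrix}$,

then $\hat{a}_{12 \ldots t}$ and $\hat{b}_{12 \ldots t}$ can be calculated from $\hat{a}_1$, $\hat{b}_1$, ..., $\hat{a}_t$, and $\hat{b}_t$ given in (2) without forming the sum of matrices on the left hand side in (9a). Using (9b) this sum of matrices is

(9c) $\sum_{j=1}^{t} \begin{bmatrix} n_j & u_j'X_j \\ X_j'u_j & X_j'X_j \end{bmatrix} = \frac{N_t}{n_i} \begin{bmatrix} n_i & u_i'X_i \\ X_i'u_i & X_i'X_i \end{bmatrix}$ for $i = 1, \ldots, t$,

where $N_t = n_1 + n_2 + \ldots + n_t$. Using (9c), the solution of (9a) is

$\begin{bmatrix} \hat{a}_{12 \ldots t} \\ \hat{b}_{12 \ldots t} \end{bmatrix} = \frac{1}{N_t} \sum_{i=1}^{t} n_i \begin{bmatrix} n_i & u_i'X_i \\ X_i'u_i & X_i'X_i \end{bmatrix}^{-1} \begin{bmatrix} u_i'y_i \\ X_i'y_i \end{bmatrix}$.

Thus, the mvue for a criterion cluster are

(10) $\hat{a}_{12 \ldots t} = \frac{1}{N_t} \sum_{i=1}^{t} n_i \hat{a}_i$ and $\hat{b}_{12 \ldots t} = \frac{1}{N_t} \sum_{i=1}^{t} n_i \hat{b}_i$.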
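The equivalence of the two routes, pooling the observations versus averaging the separate estimates as in (10), can be checked numerically. The sketch below assumes a single predictor (p = 1) and hypothetical data constructed so that (9b) holds: both groups have the same predictor mean and the same predictor mean square. The names `fit_line` and `cluster_estimates` are illustrative only.

```python
def fit_line(x, y):
    """Least-squares constant a and weight b for y = a + b*x (p = 1)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxx = sum(v * v for v in x) - sum_x * sum_x / n
    sxy = sum(v * w for v, w in zip(x, y)) - sum_x * sum_y / n
    b = sxy / sxx
    return sum_y / n - b * sum_x / n, b


def cluster_estimates(ns, a_hats, b_hats):
    """Sample-size-weighted averages of the separate estimates, as in (10)."""
    n_total = sum(ns)
    a = sum(n * a_i for n, a_i in zip(ns, a_hats)) / n_total
    b = sum(n * b_i for n, b_i in zip(ns, b_hats)) / n_total
    return a, b


# Hypothetical data; the predictor values satisfy the proportionality in (9b).
x1, y1 = [1.0, 2.0, 3.0], [2.0, 4.0, 9.0]
x2, y2 = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 0.0, 3.0, 2.0]

a1, b1 = fit_line(x1, y1)
a2, b2 = fit_line(x2, y2)
a12, b12 = cluster_estimates([len(x1), len(x2)], [a1, a2], [b1, b2])

# Route 1 of the text: pool the observations and refit.
a_pooled, b_pooled = fit_line(x1 + x2, y1 + y2)
```

If the predictor values were changed so that (9b) no longer held, the two routes would in general disagree, and only the pooled fit would give the cluster mvue.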



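When (9b) holds, the formula for the standardized regression weights for a criterion cluster is easy to obtain. Let $\beta_{12 \ldots t}$, $\beta_1$, ..., $\beta_t$ be the standardized weights corresponding to the raw weights $\hat{b}_{12 \ldots t}$, $\hat{b}_1$, ..., $\hat{b}_t$; let $Q_i$ be the $p \times p$ diagonal matrix with its elements equal to the standard deviations calculated from the observation matrix $X_i$ for the p independent variables; let $Q_{12 \ldots t}$ be the $p \times p$ diagonal matrix with its elements equal to the standard deviations calculated from the pooled observation matrix $[X_1'\ X_2' \ldots X_t']'$ for the p independent variables; and let $\sigma_{12 \ldots t}^2$, $\sigma_1^2$, ..., $\sigma_t^2$ be the sample variances for the vectors of criterion observations $[y_1'\ y_2' \ldots y_t']'$, $y_1$, ..., $y_t$, respectively. By definition the standardized weights are

$\beta_{12 \ldots t} = \frac{1}{\sigma_{12 \ldots t}} Q_{12 \ldots t} \hat{b}_{12 \ldots t}$ and $\beta_i = \frac{1}{\sigma_i} Q_i \hat{b}_i$ for $i = 1, \ldots, t$.

From (9b), $Q_{12 \ldots t} = Q_1 = \ldots = Q_t$; therefore using (10), the formula for the standardized weights for a criterion cluster is

(10a) $\beta_{12 \ldots t} = \frac{1}{N_t \sigma_{12 \ldots t}} (n_1 \sigma_1 \beta_1 + \ldots + n_t \sigma_t \beta_t)$.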

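Multiple Correlation Coefficient for a Criterion Cluster

Let $R_{12 \ldots t}^2$, $R_1^2$, ..., $R_t^2$ be the squared multiple correlation coefficients for the criterion cluster formed from criteria 1, ..., t and for the t criteria $y_1$, ..., $y_t$, respectively; let $c_i$ be the $p \times 1$ vector of intercorrelations calculated from the observations $X_i$ and $y_i$ between the p independent variables and the i-th criterion; and let $c_{12 \ldots t}$ be the $p \times 1$ vector of intercorrelations calculated from the pooled observations $[X_1'\ X_2' \ldots X_t']'$ and $[y_1'\ y_2' \ldots y_t']'$ between the p independent variables and the criterion cluster (1, 2, ..., t). By definition,

$c_i = \frac{1}{n_i \sigma_i} Q_i^{-1} \left[ X_i'y_i - \frac{1}{n_i} X_i'u_i u_i'y_i \right]$ for $i = 1, \ldots, t$, and

$c_{12 \ldots t} = \frac{1}{N_t \sigma_{12 \ldots t}} Q_{12 \ldots t}^{-1} \left[ \sum_{i=1}^{t} X_i'y_i - \frac{1}{N_t} \left( \sum_{i=1}^{t} X_i'u_i \right) \left( \sum_{i=1}^{t} u_i'y_i \right) \right]$.

From (9c), $\frac{1}{N_t} [X_1'u_1 + \ldots + X_t'u_t] = \frac{1}{n_i} X_i'u_i$ for $i = 1, \ldots, t$. Therefore, since $Q_{12 \ldots t} = Q_1 = \ldots = Q_t$, the validity coefficients for a criterion cluster are

(10b) $c_{12 \ldots t} = \frac{1}{N_t \sigma_{12 \ldots t}} (n_1 \sigma_1 c_1 + \ldots + n_t \sigma_t c_t)$.

The squared multiple correlation coefficient for the cluster (1, 2, ..., t) is

$R_{12 \ldots t}^2 = \beta_{12 \ldots t}' c_{12 \ldots t}$.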
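The validity formula (10b) can be checked against a direct calculation on pooled data. The sketch below assumes a single predictor (p = 1), for which the standardized weight equals the validity and so $R^2$ is simply the squared validity. The data are hypothetical and constructed so that (9b) holds; the helper name `moments` is illustrative. Variances use the divisor n, matching the sample moments used in this section.

```python
import math


def moments(x, y):
    """Return n, the standard deviation of y, and the x-y correlation (divisor n)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((w - mean_y) ** 2 for w in y) / n)
    cov = sum((v - mean_x) * (w - mean_y) for v, w in zip(x, y)) / n
    return n, sd_y, cov / (sd_x * sd_y)


# Hypothetical data; both groups share the same predictor mean and variance.
x1, y1 = [1.0, 2.0, 3.0], [2.0, 4.0, 9.0]
x2, y2 = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0], [1.0, 1.0, 2.0, 0.0, 3.0, 2.0]

n1, s1, c1 = moments(x1, y1)
n2, s2, c2 = moments(x2, y2)
n12, s12, c12_direct = moments(x1 + x2, y1 + y2)

# (10b): cluster validity from the separate validities.
c12 = (n1 * s1 * c1 + n2 * s2 * c2) / (n12 * s12)

# With p = 1 the standardized weight equals the validity, so R^2 = c12 * c12.
r2_12 = c12 * c12
```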

Hypothesis Testing

The critical region given in (7) for the hypothesis (6) requires the calculation of the error sum of squares or the squared multiple correlation coefficient for model (1b) when restrictions are imposed on the unknown parameters. The error sum of squares, ESS, for model (1b) when there are no restrictions on the unknown parameters is equal to the sum of the error sums of squares, $ESS_i$, for the k models (see (3)), i.e.,

$ESS = ESS_1 + ESS_2 + \ldots + ESS_k$.

Let $m_0$ and $\sigma_0^2$ be the criterion mean and variance calculated from the pooled criterion observation vector Y, and let $m_1$, ..., $m_k$ be the criterion means for $y_1$, ..., $y_k$, respectively. Then

$ESS_i = n_i \sigma_i^2 (1 - R_i^2)$ for $i = 1, \ldots, k$,

$N m_0 = n_1 m_1 + n_2 m_2 + \ldots + n_k m_k$,

$N \sigma_0^2 = n_1 (\sigma_1^2 + m_1^2) + \ldots + n_k (\sigma_k^2 + m_k^2) - N m_0^2$.

Therefore the squared multiple correlation, $R^2$, for (1b) is

$R^2 = 1 - \frac{1}{N \sigma_0^2} \sum_{i=1}^{k} n_i \sigma_i^2 (1 - R_i^2)$.
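The pooled moments and the resulting $R^2$ for (1b) need only the per-criterion summary statistics, not the raw observations. A minimal sketch follows; the function names and the numeric inputs are hypothetical, and variances are taken with divisor n as in the formulas above.

```python
def pooled_moments(ns, means, variances):
    """Pooled criterion size, mean, and variance from per-criterion statistics."""
    n_total = sum(ns)
    m0 = sum(n * m for n, m in zip(ns, means)) / n_total
    # N * var0 = sum of n_i * (var_i + m_i^2), minus N * m0^2
    var0 = sum(n * (v + m * m) for n, m, v in zip(ns, means, variances)) / n_total - m0 * m0
    return n_total, m0, var0


def r2_unrestricted(ns, means, variances, r2s):
    """Squared multiple correlation for the unrestricted model with k separate equations."""
    n_total, _, var0 = pooled_moments(ns, means, variances)
    # ESS = sum of ESS_i, with ESS_i = n_i * var_i * (1 - R_i^2)
    ess = sum(n * v * (1.0 - r2) for n, v, r2 in zip(ns, variances, r2s))
    return 1.0 - ess / (n_total * var0)


# Hypothetical summary statistics for k = 2 criteria.
n_total, m0, var0 = pooled_moments([2, 2], [0.0, 2.0], [1.0, 1.0])
r2 = r2_unrestricted([2, 2], [0.0, 2.0], [1.0, 1.0], [0.5, 0.5])
```

In this example the overall $R^2$ exceeds each separate $R_i^2$ because the separate equations also account for the difference between the two criterion means.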

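The error sum of squares, ESSH, for (9) is

$ESSH = ESS_{12 \ldots t} + ESS_{t+1} + \ldots + ESS_k$,

where $ESS_{12 \ldots t} = N_t \sigma_{12 \ldots t}^2 (1 - R_{12 \ldots t}^2)$. Therefore the squared multiple correlation, $R_H^2$, for (9) is

(11) $R_H^2 = 1 - \frac{1}{N \sigma_0^2} \left[ N_t \sigma_{12 \ldots t}^2 (1 - R_{12 \ldots t}^2) + n_{t+1} \sigma_{t+1}^2 (1 - R_{t+1}^2) + \ldots + n_k \sigma_k^2 (1 - R_k^2) \right]$.

The hypothesis (8) can be tested at the $\alpha$ significance level by computing

(11a) $F = \left( \frac{N - k(p+1)}{(t-1)(p+1)} \right) \left( \frac{R^2 - R_H^2}{1 - R^2} \right)$

and rejecting (8) if F exceeds the $100(1-\alpha)\%$ point of the central F distribution with $(t-1)(p+1)$ and $N - k(p+1)$ degrees of freedom.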
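The test of hypothesis (8) can be sketched directly from the two squared multiple correlations. The function name `cluster_f_test` and the inputs are hypothetical; the critical value of the F distribution is not computed here and would be taken from a table or a statistics library.

```python
def cluster_f_test(r2_full, r2_h, n_total, k, t, p):
    """F statistic and degrees of freedom for merging t of k equations.

    r2_full: squared multiple correlation with all k equations separate
    r2_h:    squared multiple correlation after the t equations are clustered
    """
    df1 = (t - 1) * (p + 1)        # number of restrictions imposed
    df2 = n_total - k * (p + 1)    # error degrees of freedom, unrestricted model
    f = (df2 / df1) * (r2_full - r2_h) / (1.0 - r2_full)
    return f, df1, df2


# Hypothetical case: 4 criteria, 3 predictors, 200 pooled observations,
# testing whether 2 of the equations can be combined.
f, df1, df2 = cluster_f_test(r2_full=0.60, r2_h=0.55, n_total=200, k=4, t=2, p=3)
```

A small drop from `r2_full` to `r2_h` gives a small F, which is the sense in which a grouping step loses little predictive efficiency.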

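Application to a Four Criteria Model: A Worked Example

Given four criteria $y_1$, $y_2$, $y_3$, and $y_4$, where $y_i$ is an $n_i \times 1$ vector of observations, and the predictor matrices $X_1$, $X_2$, $X_3$, $X_4$, where $X_i$ is an $n_i \times p$ matrix of observations on p independent variables, the greatest predictive power is attained when each criterion variable is predicted from its own regression on the independent variables. The initial stage, i.e., Stage 4, employs the following model:

$E \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} [u_1 X_1] & 0 & 0 & 0 \\ 0 & [u_2 X_2] & 0 & 0 \\ 0 & 0 & [u_3 X_3] & 0 \\ 0 & 0 & 0 & [u_4 X_4] \end{bmatrix} \begin{bmatrix} a_1 \\ b_1 \\ \vdots \\ a_4 \\ b_4 \end{bmatrix}$ with dispersion matrix $\sigma^2 I$.

The mvue $\hat{a}_i$ and $\hat{b}_i$, for $a_i$ and $b_i$ are obtained from (2), and the squared multiple correlation, $R^2$, for this model is

$R^2 = 1 - \frac{1}{N \sigma_0^2} \sum_{i=1}^{4} n_i \sigma_i^2 (1 - R_i^2)$,

where $N = n_1 + n_2 + n_3 + n_4$ and $m_0$ and $\sigma_0^2$ are the mean and variance of the pooled criterion observations.

For Stage 3, assume (9b) holds for $X_1$, $X_2$, $X_3$, and $X_4$. Under the linear hypothesis $a_1 = a_2$ and $b_1 = b_2$, the mvue, $\hat{a}_{12}$ and $\hat{b}_{12}$, for the criterion cluster (1,2) formed from criteria 1 and 2 are (see (10))

$\hat{a}_{12} = \frac{n_1 \hat{a}_1 + n_2 \hat{a}_2}{n_1 + n_2}$ and $\hat{b}_{12} = \frac{n_1 \hat{b}_1 + n_2 \hat{b}_2}{n_1 + n_2}$.

The standard weights, $\beta_{12}$, and the validities, $c_{12}$, for the cluster (1,2) are (see (10a) and (10b))

$\beta_{12} = \frac{1}{(n_1 + n_2) \sigma_{12}} (n_1 \sigma_1 \beta_1 + n_2 \sigma_2 \beta_2)$, and

$c_{12} = \frac{1}{(n_1 + n_2) \sigma_{12}} (n_1 \sigma_1 c_1 + n_2 \sigma_2 c_2)$.

The squared multiple correlation coefficient, $R_{12,3,4}^2$, for this stage is

$R_{12,3,4}^2 = 1 - \frac{(n_1 + n_2) \sigma_{12}^2 (1 - R_{12}^2) + n_3 \sigma_3^2 (1 - R_3^2) + n_4 \sigma_4^2 (1 - R_4^2)}{N \sigma_0^2}$,

where $R_{12}^2 = \beta_{12}' c_{12}$, $(n_1 + n_2) m_{12} = n_1 m_1 + n_2 m_2$, $(n_1 + n_2) \sigma_{12}^2 = n_1 (\sigma_1^2 + m_1^2) + n_2 (\sigma_2^2 + m_2^2) - (n_1 + n_2) m_{12}^2$, and

$N \sigma_0^2 = (n_1 + n_2)(\sigma_{12}^2 + m_{12}^2) + n_3 (\sigma_3^2 + m_3^2) + n_4 (\sigma_4^2 + m_4^2) - N m_0^2$ with $N m_0 = n_1 m_1 + n_2 m_2 + n_3 m_3 + n_4 m_4$.

Equation (11a) can now be used to test at the $\alpha$ significance level the hypothesis H1: $a_1 = a_2$ and $b_1 = b_2$ by computing

$F = \left( \frac{N - 4(p+1)}{p+1} \right) \left( \frac{R^2 - R_{12,3,4}^2}{1 - R^2} \right)$

and rejecting H1 if F exceeds $F_\alpha(p+1,\ N - 4(p+1))$.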

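For Stage 2, accepting H1 as true, the additional restrictions $a_3 = a_4$ and $b_3 = b_4$ are imposed and the mvue, $\hat{a}_{34}$ and $\hat{b}_{34}$, for the criterion cluster (3,4) formed from criteria 3 and 4 are

$\hat{a}_{34} = \frac{n_3 \hat{a}_3 + n_4 \hat{a}_4}{n_3 + n_4}$ and $\hat{b}_{34} = \frac{n_3 \hat{b}_3 + n_4 \hat{b}_4}{n_3 + n_4}$.

The standard weights, $\beta_{34}$, and the validities, $c_{34}$, for the cluster (3,4) are

$\beta_{34} = \frac{1}{(n_3 + n_4) \sigma_{34}} (n_3 \sigma_3 \beta_3 + n_4 \sigma_4 \beta_4)$, and

$c_{34} = \frac{1}{(n_3 + n_4) \sigma_{34}} (n_3 \sigma_3 c_3 + n_4 \sigma_4 c_4)$, where

$(n_3 + n_4) \sigma_{34}^2 = n_3 (\sigma_3^2 + m_3^2) + n_4 (\sigma_4^2 + m_4^2) - (n_3 + n_4) m_{34}^2$.

The model used to obtain these estimates is

(14) $E \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} [u_1 X_1] & 0 \\ [u_2 X_2] & 0 \\ 0 & [u_3 X_3] \\ 0 & [u_4 X_4] \end{bmatrix} \begin{bmatrix} a_{12} \\ b_{12} \\ a_{34} \\ b_{34} \end{bmatrix}$ with dispersion matrix $\sigma^2 I$.

The squared multiple correlation coefficient, $R_{12,34}^2$, for (14) is (from (11) with k=2)

$R_{12,34}^2 = 1 - \frac{(n_1 + n_2) \sigma_{12}^2 (1 - R_{12}^2) + (n_3 + n_4) \sigma_{34}^2 (1 - R_{34}^2)}{(n_1 + n_2)(\sigma_{12}^2 + m_{12}^2) + (n_3 + n_4)(\sigma_{34}^2 + m_{34}^2) - N m_0^2}$,

where $R_{34}^2 = \beta_{34}' c_{34}$ and $(n_3 + n_4) m_{34} = n_3 m_3 + n_4 m_4$. Equation (11a) can now be used to test at the $\alpha$ significance level the hypothesis

H2: $a_3 = a_4$ and $b_3 = b_4$ given H1 is true

by computing

$F = \left( \frac{N - 3(p+1)}{p+1} \right) \left( \frac{R_{12,3,4}^2 - R_{12,34}^2}{1 - R_{12,3,4}^2} \right)$

and rejecting H2 if F exceeds $F_\alpha(p+1,\ N - 3(p+1))$, where $R_{12,3,4}^2$ is the squared multiple correlation for Stage 3.

Equation (11a) can also be used to test the hypothesis

H3: $a_1 = a_2$, $a_3 = a_4$, $b_1 = b_2$, and $b_3 = b_4$

by computing

$F = \left( \frac{N - 4(p+1)}{2(p+1)} \right) \left( \frac{R^2 - R_{12,34}^2}{1 - R^2} \right)$

and rejecting H3 if F exceeds $F_\alpha(2(p+1),\ N - 4(p+1))$, where $R^2$ is the squared multiple correlation for Stage 4.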

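For Stage 1, accepting H2 as true, the additional restrictions $a_{12} = a_{34}$ and $b_{12} = b_{34}$ are imposed and the mvue, $\hat{a}_{1234}$ and $\hat{b}_{1234}$, for the criterion cluster (1,2,3,4) formed from the four criteria are

$\hat{a}_{1234} = \frac{(n_1 + n_2) \hat{a}_{12} + (n_3 + n_4) \hat{a}_{34}}{N}$ and $\hat{b}_{1234} = \frac{(n_1 + n_2) \hat{b}_{12} + (n_3 + n_4) \hat{b}_{34}}{N}$.

The standard weights, $\beta_{1234}$, and the validities, $c_{1234}$, for the cluster (1,2,3,4) are

$\beta_{1234} = \frac{1}{N \sigma_0} \left[ (n_1 + n_2) \sigma_{12} \beta_{12} + (n_3 + n_4) \sigma_{34} \beta_{34} \right]$, and

$c_{1234} = \frac{1}{N \sigma_0} \left[ (n_1 + n_2) \sigma_{12} c_{12} + (n_3 + n_4) \sigma_{34} c_{34} \right]$, where

$N \sigma_0^2 = (n_1 + n_2)(\sigma_{12}^2 + m_{12}^2) + (n_3 + n_4)(\sigma_{34}^2 + m_{34}^2) - N m_0^2$.

The model used to obtain these estimates is

(15) $E \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} [u_1 X_1] \\ [u_2 X_2] \\ [u_3 X_3] \\ [u_4 X_4] \end{bmatrix} \begin{bmatrix} a_{1234} \\ b_{1234} \end{bmatrix}$ with dispersion matrix $\sigma^2 I$.

The squared multiple correlation coefficient, $R_{1234}^2$, for (15) is

$R_{1234}^2 = \beta_{1234}' c_{1234}$.

Equation (11a) can now be used to test at the $\alpha$ significance level the hypothesis

H4: $a_{12} = a_{34}$ and $b_{12} = b_{34}$, given $a_1 = a_2$, $a_3 = a_4$, $b_1 = b_2$, and $b_3 = b_4$

by computing

$F = \left( \frac{N - 2(p+1)}{p+1} \right) \left( \frac{R_{12,34}^2 - R_{1234}^2}{1 - R_{12,34}^2} \right)$

and rejecting H4 if F exceeds $F_\alpha(p+1,\ N - 2(p+1))$, where $R_{12,34}^2$ is the squared multiple correlation for Stage 2. The hypothesis

H5: $a_1 = a_2 = a_3 = a_4$ and $b_1 = b_2 = b_3 = b_4$

can be tested at the $\alpha$ significance level by computing

$F = \left( \frac{N - 4(p+1)}{3(p+1)} \right) \left( \frac{R^2 - R_{1234}^2}{1 - R^2} \right)$

and rejecting H5 if F exceeds $F_\alpha(3(p+1),\ N - 4(p+1))$, where $R^2$ is the squared multiple correlation for Stage 4.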